I-vector transformation and scaling for PLDA based speaker recognition

Authors

  • Sandro Cumani
  • Pietro Laface
Abstract

This paper proposes a density model transformation for speaker recognition systems based on i-vectors and Probabilistic Linear Discriminant Analysis (PLDA) classification. The PLDA model assumes that the i-vectors are distributed according to the standard normal distribution, whereas it is well known that this is not the case. Experiments have shown that the i-vectors are better modeled, for example, by a Heavy-Tailed distribution, and that significant improvement of the classification performance can be obtained by whitening and length-normalizing the i-vectors. In this work we propose to transform the i-vectors, which are extracted without regard to the classifier that will be used, so that their distribution becomes more suitable for discriminating speakers with PLDA. This is performed by means of a sequence of affine and non-linear transformations whose parameters are obtained by Maximum Likelihood (ML) estimation on the training set. The second contribution of this work is the reduction of the mismatch between the development and test i-vector distributions by means of a scaling factor tuned to the estimated i-vector distribution, rather than by means of a blind length normalization. Our tests performed on the NIST SRE-2010 and SRE-2012 evaluation sets show that improvements of their Cost Functions on the order of 10% can be obtained for both evaluation sets.
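
For reference, the sketch below illustrates the standard whitening and length-normalization pre-processing that the abstract contrasts with the proposed ML-trained transformation. It is a minimal sketch assuming NumPy arrays of row i-vectors; the function and variable names are illustrative and not taken from the paper.

    import numpy as np

    def whiten_and_length_normalize(ivectors, dev_ivectors):
        # Estimate whitening statistics on development data only
        # (assumption: rows are i-vectors, columns are dimensions).
        mu = dev_ivectors.mean(axis=0)
        cov = np.cov(dev_ivectors, rowvar=False)
        # Eigendecomposition of the covariance; W = E * diag(1/sqrt(lambda))
        # maps the development data to zero mean and identity covariance.
        eigvals, eigvecs = np.linalg.eigh(cov)
        W = eigvecs / np.sqrt(eigvals)
        whitened = (ivectors - mu) @ W
        # "Blind" length normalization: project every i-vector onto the unit sphere.
        norms = np.linalg.norm(whitened, axis=1, keepdims=True)
        return whitened / norms

After this step every i-vector has unit norm regardless of its estimated distribution; the paper's second contribution replaces this fixed unit-norm scaling with a scaling factor tuned to the estimated i-vector distribution.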

Similar articles

PLDA Modeling in I-Vector and Supervector Space for Speaker Verification

In this paper, we advocate the use of the uncompressed form of i-vector. We employ the probabilistic linear discriminant analysis (PLDA) to handle speaker and session variability for the speaker verification task. An i-vector is a low-dimensional vector containing both speaker and channel information acquired from a speech segment. When PLDA is used on an i-vector, dimension reduction is performed twice – ...
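
For context, the Gaussian PLDA model referred to in these abstracts is commonly written with the simplified generative equation below; the exact variant used in each cited paper (e.g., heavy-tailed PLDA or PLDA in the i-supervector space) may differ in its details.

    \phi = \mu + U y + \epsilon, \qquad y \sim \mathcal{N}(0, I), \qquad \epsilon \sim \mathcal{N}(0, \Sigma)

Here \phi is the i-vector, \mu the global mean, U a low-rank matrix whose columns span the speaker subspace, y a latent speaker variable shared by all segments of the same speaker, and \epsilon a residual term modelling channel and session variability. Verification scores compare the likelihood that two i-vectors share the same latent y against the likelihood that they do not.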

From Features to Speaker Vectors by means of Restricted Boltzmann Machine Adaptation

Restricted Boltzmann Machines (RBMs) have shown success in different stages of speaker recognition systems. In this paper, we propose a novel framework to produce a vector-based representation for each speaker, which will be referred to as RBMvector. This new approach maps the speaker spectral features to a single fixed-dimensional vector carrying speaker-specific information. In this work, a g...

STC Speaker Recognition System for the NIST i-Vector Challenge

This paper presents a Speech Technology Center (STC) system submitted to the NIST i-vector Challenge. The system includes different subsystems based on PLDA, LDA-SVM, RBM-PLDA and DBN-PLDA. We propose an original iterative scheme for clustering the NIST i-vector Challenge devset. We also introduce the RBM-PLDA subsystem in the NIST i-vector Challenge. Experiments performed on the progress datas...

PLDA in the I-Supervector Space for Text-Independent Speaker Verification

In this paper, we advocate the use of the uncompressed form of i-vector and depend on subspace modeling using probabilistic linear discriminant analysis (PLDA) in handling the speaker and session (or channel) variability. An i-vector is a low-dimensional vector containing both speaker and channel information acquired from a speech segment. When PLDA is used on an i-vector, dimension reduction i...

I-Vector/PLDA Variants for Text-Dependent Speaker Recognition

The i-vector/PLDA approach currently dominates the field of text-independent speaker recognition and the question of how to translate this methodology to the text-dependent domain has recently become an active area of research. The essential difference between the two fields is that it is possible to do speaker recognition with enrollment and test utterances of very short duration in the text-d...

Publication year: 2016